FOREWORD


Radar signal processing with multilayered perceptrons was investigated.
Networks with no hidden layer and with a single hidden layer were tested on
field-collected millimeter wave target returns which had been corrupted with
artificial Gaussian noise at a signal-to-noise ratio of 3 dB. Performance as
a function of network architecture was characterized.

The authors would like to thank Jim Queen and Karl Krueger for providing
the data used in the study and for helpful suggestions concerning data
pre-processing.

This  study  has  been  supported  by  the  Office  of  Naval  Technology  through 
the  Independent  Exploratory  Development  Program  and  was  conducted  in  the  Space 
and  Ocean  Geodesy  Branch  of  the  Strategic  Systems  Department. 

This  technical  report  has  been  reviewed  by  Patrick  E.  Beveridge,  Head  of 
the  Space  and  Ocean  Geodesy  Branch,  and  J.  Ralph  Fallin,  Head  of  the  Space  and 
Surface  Systems  Division. 
OVERVIEW

In  April  of  1988  work  was  begun  at  the  Space  and  Surface  Systems 
Division  of  the  Strategic  Systems  Department  at  the  Naval  Surface  Warfare 
Center  in  Dahlgren,  Virginia  on  applying  Artificial  Neural  Systems  (ANS)  to 
radar  signal  processing.  This  work  was  funded  as  part  of  the  Independent 
Exploratory  Development  research  program.  This  report  summarizes  the  current 
status  of  this  effort  as  of  October  1988. 

In identifying a candidate neural network for radar signal processing,
many different neural network algorithms (paradigms) were considered.
Multilayered Perceptrons (MLPs) and Carpenter-Grossberg networks were chosen
as the initial paradigms to be investigated. Ease of implementation led us
to employ the MLP algorithm first.


NETWORK  TRAINING 

The MLPs used for the study contained an input layer, either a single hidden
layer or no hidden layer, and an output layer. The input patterns were
propagated through the network using the standard equations and a sigmoid
transfer function. For the purposes of this study, no direct links from the
input to the output layer were allowed.
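The forward pass described above can be sketched as follows. This is a minimal illustration only: the function names, the use of bias terms, and the weight layout are assumptions made for the example and are not taken from the software used in the study.

```python
import math

def sigmoid(x):
    """Logistic transfer function; output lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases):
    """One fully connected layer: weights[k][i] links input i to node k."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def mlp_forward(pattern, layers):
    """Propagate a pattern through a list of (weights, biases) layers.

    One entry in `layers` is the no-hidden-layer case; two entries give
    a single hidden layer followed by the output layer.
    """
    activations = pattern
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations
```

Because each layer feeds only the next one, the sketch has no direct links from the input to the output layer whenever a hidden layer is present.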

Next the edge weight corrections were computed. The two methods used in the
study for this purpose were the standard error back propagation scheme and
error back propagation with selective learning. The selective learning
algorithm was developed in response to certain pathological situations which
can occur when using the standard gradient descent algorithm. This procedure
is helpful in situations where certain patterns in the training set are
sparsely represented, or when certain output states of the network lie in
close error space proximity to the desired output state. A brief description
of the algorithm is given below.

On  the  first  pass  through  the  training  patterns  selective  learning  does 
not  take  place.  This  pass  serves  to  provide  the  user  with  values  of  rmax  and 
rms.  On  subsequent  passes  the  following  procedure  is  used  to  modify  the 
errors  of  the  output  nodes. 

    if (|\delta_j| < mfac * rms) then
        \delta_j = efac * \delta_j                                      (1)
    end if

Efac is chosen to lie between 0 and 1, and mfac is chosen to be slightly less
than rmax/rms. The rest of the training procedure proceeds as before. A more
complete discussion of selective learning is provided elsewhere.
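The selective learning rule can be illustrated with a short sketch. It assumes that rms is the root mean square of the output node errors over one pass and that rmax is the largest absolute output error seen in that pass; the report does not define rmax, so that reading and the function names below are assumptions.

```python
import math

def pass_statistics(all_errors):
    """Return (rms, rmax) over every output node error from one pass.

    all_errors holds one list of output node errors per training pattern.
    rmax is ASSUMED to be the largest absolute error in the pass.
    """
    flat = [e for pattern in all_errors for e in pattern]
    rms = math.sqrt(sum(e * e for e in flat) / len(flat))
    rmax = max(abs(e) for e in flat)
    return rms, rmax

def selective_errors(errors, rms, mfac, efac):
    """Apply rule (1): scale by efac every error smaller than mfac * rms."""
    return [efac * e if abs(e) < mfac * rms else e for e in errors]
```

With mfac slightly below rmax/rms, only errors close to the largest one in the previous pass are left unscaled, so the poorly learned patterns dominate the weight corrections.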


APPROACH 

In early 1987 radar data was collected at the Kiernan Reentry Measurement
Site (KREMS) at Kwajalein Atoll in the Marshall Islands, on several
objects under various conditions. Millimeter wave radar returns which were
collected from a towed kite were used for this study.

Figure 1 illustrates the organization of the data into blocks consisting
of 80 amplitude and phase values for both the pp and the op channels. Two
hundred of these pure signal blocks were used to create the training sets.

The pre-processing steps for the data are flow charted in Figure 2. Each data
pair was transformed to the I Q plane, corrupted with noise, and fast Fourier
transformed. Pre-processing of the radar data was performed on a Cyber 875.
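The three pre-processing steps can be sketched for the pulses of a single range bin. The sketch assumes that phase is given in radians and that the transform is taken across the pulses of one range bin; neither detail is stated in the report. A naive discrete Fourier transform stands in for the fast Fourier transform so that the example needs only the standard library.

```python
import cmath
import math

def polar_to_iq(amplitude, phase_rad):
    """Convert one amplitude/phase pair to an (I, Q) pair."""
    return amplitude * math.cos(phase_rad), amplitude * math.sin(phase_rad)

def dft(samples):
    """Naive discrete Fourier transform of a list of complex samples."""
    n = len(samples)
    return [sum(x * cmath.exp(-2j * math.pi * k * m / n)
                for m, x in enumerate(samples))
            for k in range(n)]

def preprocess_bin(amplitudes, phases, noise_iq):
    """Polar to I/Q, add noise, then transform (ASSUMED per-bin layout).

    noise_iq is a list of (I, Q) noise samples, one per pulse.
    """
    corrupted = []
    for a, p, (noise_i, noise_q) in zip(amplitudes, phases, noise_iq):
        i, q = polar_to_iq(a, p)
        corrupted.append(complex(i + noise_i, q + noise_q))
    return dft(corrupted)
```

A constant, noise-free return concentrates all of its energy in the first transform bin, which is a convenient check on the ordering of the steps.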

For  each  signal  plus  noise  block,  a  pure  noise  block  was  generated  as 
follows.  Since  the  pp  channel  dominated  the  signal,  a  decision  was  made  to 
use  its  maximum  average  power  level  in  the  generation  of  both  the  noise  to  be 
added  to  both  signal  channels  and  the  pure  noise  signal.  First,  the  average 
power  value  over  all  eight  pulses  was  computed  for  each  range  bin  in  the  pp 
channel

    P_i = \frac{1}{8} \sum_{j=1}^{8} \left( I_{ij}^2 + Q_{ij}^2 \right)        (2)


NSWC  TR  90-171 


FIGURE 1. DATA BLOCK STRUCTURE (80 op amplitude and phase values and 80 pp
amplitude and phase values; 320 scalars per block)

FIGURE 2. DATA PRE-PROCESSING
Next the maximum of these averages was computed

    P_{max} = \max(P_1, P_2, P_3, \ldots, P_{10})                           (3)


Next the user-specified signal-to-noise ratio in decibels was converted to
a volts^2 ratio

    sn = 10^{sndb/10}                                                       (4)


Assuming an equal distribution in both I and Q, the rms for the target signal,
\sigma_{tar}, was computed                                                  (5)

Using the user-specified signal-to-noise ratio, an rms value for the noise,
\sigma_{noise}, was computed                                                (6)
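One reading of Equations (5) and (6) that is consistent with Equations (2) through (4) is sketched below. It assumes that the power P_{max} is split evenly between the I and Q components and that sn is a power ratio. These forms are offered as a plausible interpretation and are not transcribed from the report.

```latex
% Assumed forms of Equations (5) and (6); not taken from the report.
\sigma_{tar} = \sqrt{\frac{P_{max}}{2}}, \qquad
\sigma_{noise} = \frac{\sigma_{tar}}{\sqrt{sn}}
```

Under this reading, a 3-dB signal-to-noise ratio gives sn = 10^{0.3}, or about 2.0, so \sigma_{noise} is roughly 0.71 \sigma_{tar}.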


Finally, 320 Gaussian variates with a mean of 0 and an rms of \sigma_{noise}
were generated to corrupt the target block, and a similar number with a mean
of 0 and an rms of \sigma_{noise} became the pure noise block.
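The noise generation procedure can be sketched as follows. The first three steps follow Equations (2) through (4). The expressions used for the target and noise rms are assumptions (an even split of P_{max} between I and Q, with sn treated as a power ratio), and the function names are hypothetical.

```python
import math
import random

def noise_sigma(iq_pp, sndb):
    """Return the noise rms for one block.

    iq_pp[i][j] is the (I, Q) pair for range bin i on pulse j of the
    pp channel; sndb is the requested signal-to-noise ratio in dB.
    """
    # Equation (2): average power per range bin over all pulses.
    avg_power = [sum(i * i + q * q for i, q in row) / len(row)
                 for row in iq_pp]
    p_max = max(avg_power)                  # Equation (3)
    sn = 10.0 ** (sndb / 10.0)              # Equation (4)
    sigma_tar = math.sqrt(p_max / 2.0)      # ASSUMED form of Equation (5)
    return sigma_tar / math.sqrt(sn)        # ASSUMED form of Equation (6)

def gaussian_block(sigma, count=320, rng=random):
    """Draw `count` zero-mean Gaussian variates with rms `sigma`."""
    return [rng.gauss(0.0, sigma) for _ in range(count)]
```

One call to gaussian_block supplies the variates added to a target block, and a second call with the same sigma supplies the matching pure noise block.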

Before training could begin the data was normalized to lie between 0
and 1. Initially the 160 I^2 + Q^2 values from a given block were normalized
by dividing each of them by the largest I^2 + Q^2 value in the block. Two
hundred blocks of signal in the presence of noise were placed in a training
file with 200 blocks of pure noise. Their alternating placement in the file
is illustrated in Figure 3, and an example normalized signal plus noise block
is shown in Figure 4.
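The normalization and the alternating file layout can be sketched as below. The target codes (1,0) for signal plus noise and (0,1) for noise follow Figure 6. Placing the signal plus noise block first in each pair is an assumption, as are the function names.

```python
def normalize_block(iq_pairs):
    """Map (I, Q) pairs to I^2 + Q^2 values scaled so the largest is 1."""
    power = [i * i + q * q for i, q in iq_pairs]
    peak = max(power)
    if peak == 0.0:
        return power            # an all-zero block is left unchanged
    return [p / peak for p in power]

def build_training_file(signal_blocks, noise_blocks):
    """Alternate signal-plus-noise and pure-noise blocks with targets."""
    patterns = []
    for s_block, n_block in zip(signal_blocks, noise_blocks):
        patterns.append((s_block, (1, 0)))   # (1,0) = signal + noise
        patterns.append((n_block, (0, 1)))   # (0,1) = noise
    return patterns
```

With 200 blocks of each kind, build_training_file returns the 400 patterns that make up one file.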

Various multilayered perceptrons were trained on four files of data at a
fixed signal-to-noise ratio. After a network had been trained to a desired
rms error level, its performance was tested on 10 files of data. This
procedure is summarized in Figure 5.
FIGURE 3. TRAINING FILE CREATION (after normalization, 200 pure noise blocks
B(N) and 200 signal plus noise blocks B(S+N) are merged into a combined data
file, which becomes the network training file)


FIGURE 4. (Plot of power versus neuron number for an example normalized
signal plus noise block)


FIGURE 5. NETWORK TRAINING/TESTING (each file holds 200 S+N blocks and 200 N
blocks; training files 1 through 4 feed the training algorithm, which
produces the weights W_{ijk}; the trained network is then run on the testing
files to obtain P_d, P_{fa}, and P_{ca})
RESULTS

Networks with two, one, and no hidden layers were tested on the data. A
typical network architecture of 160 input nodes, 8 hidden nodes in the first
hidden layer, 4 hidden nodes in the second hidden layer, and 2 output nodes is
shown in Figure 6. Networks were trained to rms error levels of around .1
using a learning rate of .1 with a .9 rate of momentum transference. Networks
usually were able to master about 96 percent of the 1600 training patterns.
All training and testing was done on a Sun 3/280 with a Weitek floating point
accelerator.
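The role of the learning rate and the momentum term can be illustrated with the usual back propagation weight update, in which each step is the scaled negative gradient plus a fraction of the previous step. The sketch assumes this standard form; the report does not give its own update equation, and the function name is hypothetical.

```python
def momentum_step(weights, gradients, previous_steps, rate=0.1, momentum=0.9):
    """Return (new_weights, steps) for one gradient descent update.

    steps[i] = -rate * gradients[i] + momentum * previous_steps[i]
    The defaults are the learning rate and momentum quoted in the text.
    """
    steps = [-rate * g + momentum * p
             for g, p in zip(gradients, previous_steps)]
    new_weights = [w + s for w, s in zip(weights, steps)]
    return new_weights, steps
```

The returned steps are passed back in as previous_steps on the next update, which is how earlier corrections carry forward.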

Performance of the networks was characterized by P_{ca}, P_{fa}, and P_d.

Network response was determined using a threshold T as follows:

    if (output of first output neuron) > T then
        signal + noise is present
    else
        noise is present
    end if

Using this criterion the correctness of the network's response could be
evaluated.
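The decision rule and the three performance measures can be sketched as follows, using the glossary definitions of probability of detection, probability of false alarm, and probability of correct answer. The function name and argument layout are assumptions made for the example.

```python
def performance(first_outputs, is_signal, threshold):
    """Return (p_d, p_fa, p_ca) for one set of test patterns.

    first_outputs[k] is the first output neuron's value for pattern k;
    is_signal[k] is True when pattern k is a signal plus noise block.
    """
    detections = 0
    false_alarms = 0
    for output, signal in zip(first_outputs, is_signal):
        declared_signal = output > threshold   # the decision rule above
        if declared_signal and signal:
            detections += 1
        elif declared_signal and not signal:
            false_alarms += 1
    n_signal = sum(1 for s in is_signal if s)
    n_noise = len(is_signal) - n_signal
    p_d = detections / n_signal
    p_fa = false_alarms / n_noise
    p_ca = (detections + (n_noise - false_alarms)) / len(is_signal)
    return p_d, p_fa, p_ca
```

When a test file holds equal numbers of signal plus noise and pure noise blocks, p_ca reduces to (p_d + 1 - p_fa) / 2.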

Performance of the networks as a function of architecture was studied
extensively using the 3-dB data. P_{ca} for various networks with 0 and 1
hidden layers is summarized in Table 1. P_{fa} and P_d for the same networks
are summarized in Table 2. Results for those networks trained using
exponential decay were virtually identical to the results obtained using
straight back propagation.


CONCLUSIONS 

The highest P_{ca} was obtained by a network with no hidden layer using a
threshold value of .5. This seems to indicate that the additional hidden
layer in the other networks was trained in an ineffective manner. This
problem could be caused by an insufficient number of training examples to
converge the edge weights, loss of some of the signal's salient features
through use of the normalization procedure, sensitivity of edge weight
convergence to initialization, or failure to use small enough learning rates
in training.
(1,0) = Signal + Noise
(0,1)  =  Noise 

N.  B.  -  Not  all  connections  have  been  shown. 

FIGURE  6.  SAMPLE  NETWORK  ARCHITECTURE 
TABLE 1.  P_{ca} ON 3-dB DATA FOR NO/SINGLE HIDDEN LAYER PERCEPTRONS

Number of                        Selective    PCA (Probability of
Hidden Nodes    T (Threshold)    Learning     Correct Answer)
------------    -------------    ---------    -------------------
     0               .5             no              .9482
     6               .5             no              .9360
    12               .5             no              .9362
    24               .5             no              .9367
     0               .9             no              .8462
     6               .9             no              .9350
    12               .9             no              .9320
    24               .9             no              .9282
     0               .5             yes             .9475
     6               .5             yes             .9365
    12               .5             yes             .9475
    24               .5             yes             .9452
     0               .9             yes             .7815
     6               .9             yes             .9045
    12               .9             yes             .9320
    24               .9             yes             .9005
TABLE 2.  P_{fa}, P_d ON 3-dB DATA FOR NO/SINGLE HIDDEN LAYER PERCEPTRONS

                                              PFA                PD
Number of                        Selective    (Probability of    (Probability of
Hidden Nodes    T (Threshold)    Learning     False Alarm)       Detection)
------------    -------------    ---------    ---------------    ---------------
     0               .5             no            .0380              .9345
     6               .5             no            .0700              .9420
    12               .5             no            .0600              .9325
    24               .5             no            .0565              .9300
     0               .9             no            .0040              .6965
     6               .9             no            .0450              .9150
    12               .9             no            .0330              .8970
    24               .9             no            .0305              .8870
     0               .5             yes           .0350              .9300
     6               .5             yes           .0730              .9460
    12               .5             yes           .0400              .9350
    24               .5             yes           .0480              .9385
     0               .9             yes           .0015              .5645
     6               .9             yes           .0185              .8275
    12               .9             yes           .0330              .8970
    24               .9             yes           .0080              .8090
Alternate criteria for the selection of the "best" network use the P_{fa}
and P_d factors. If we require that the network have a P_{fa} less than .01
and a P_d as large as possible, then the network with 24 hidden nodes which
was trained using selective learning and tested with a threshold of .9 is the
best one. The network with no hidden layer which was trained using selective
learning and tested with a threshold of .9 is the best network if lowest
P_{fa} is the only requirement. This network's low P_{fa} is offset by the
fact that it classifies almost half of the signal plus noise examples as pure
noise.
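The selection criteria discussed above can be applied directly to the Table 2 entries. The sketch below transcribes those entries and applies the two criteria; the record layout and function names are assumptions made for the example.

```python
from collections import namedtuple

Row = namedtuple("Row", "hidden threshold selective p_fa p_d")

# Table 2 entries: hidden nodes, threshold, selective learning, P_fa, P_d.
TABLE_2 = [
    Row(0, 0.5, False, 0.0380, 0.9345), Row(6, 0.5, False, 0.0700, 0.9420),
    Row(12, 0.5, False, 0.0600, 0.9325), Row(24, 0.5, False, 0.0565, 0.9300),
    Row(0, 0.9, False, 0.0040, 0.6965), Row(6, 0.9, False, 0.0450, 0.9150),
    Row(12, 0.9, False, 0.0330, 0.8970), Row(24, 0.9, False, 0.0305, 0.8870),
    Row(0, 0.5, True, 0.0350, 0.9300), Row(6, 0.5, True, 0.0730, 0.9460),
    Row(12, 0.5, True, 0.0400, 0.9350), Row(24, 0.5, True, 0.0480, 0.9385),
    Row(0, 0.9, True, 0.0015, 0.5645), Row(6, 0.9, True, 0.0185, 0.8275),
    Row(12, 0.9, True, 0.0330, 0.8970), Row(24, 0.9, True, 0.0080, 0.8090),
]

def best_under_false_alarm_limit(rows, p_fa_limit):
    """Row with the highest P_d among rows whose P_fa is below the limit."""
    eligible = [r for r in rows if r.p_fa < p_fa_limit]
    return max(eligible, key=lambda r: r.p_d) if eligible else None

def lowest_false_alarm(rows):
    """Row with the smallest P_fa, ignoring P_d."""
    return min(rows, key=lambda r: r.p_fa)
```

Calling best_under_false_alarm_limit(TABLE_2, 0.01) selects the 24 hidden node network trained with selective learning at a threshold of .9, and lowest_false_alarm(TABLE_2) selects the network with no hidden layer under the same training and threshold.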

Some  ideas  to  achieve  improved  edge  weight  utilization  are  as  follows: 

1.  single  pulse  networks  with  voting, 

2.  voting  with  multi-architecture  networks, 

3.  nonlinear  normalization  procedures, 

4.  network  training  with  I  Q  pairs  after  FFT,  and 

5.  network  training  with  I  Q  pairs  before  FFT. 

Some  preliminary  work  has  been  done  on  these  ideas  but  further  work  is  needed 
to  fully  evaluate  their  potential.  Along  with  evaluating  these  ideas  future 
work  will  focus  on  target  detection  in  clutter. 
GLOSSARY

rms              = root mean square error for all the output nodes over all
                   the patterns for a given pass

mfac             = user supplied control parameter used in selective learning

efac             = user supplied scaling factor used in selective learning

(I_{ij}, Q_{ij}) = I, Q pair associated with the ith range bin on the jth pulse

P_i              = average power value for the ith range bin over all the pulses

P_{max}          = maximum of all the P_i's over all the bins

sndb             = signal to noise ratio in decibels

sn               = signal to noise ratio as a volts^2 ratio

\sigma_{tar}     = rms for target signal

\sigma_{noise}   = rms for noise

P_{fa}           = probability of false alarm, the probability that a network
                   incorrectly identifies a noise return as target

P_d              = probability of detection, the probability that a network
                   correctly identifies a target return

P_{ca}           = probability of correct answer, the probability that a network
                   correctly identifies a return

T                = threshold on the first output neuron which determines
                   classification